Germline-somatic JAK2 interactions are associated with clonal expansion in myelofibrosis

Myelofibrosis is a rare myeloproliferative neoplasm (MPN) with high risk for progression to acute myeloid leukemia. Our integrated genomic analysis of up to 933 myelofibrosis cases identifies 6 germline susceptibility loci, 4 of which overlap with previously identified MPN loci. Virtual karyotyping identifies high frequencies of mosaic chromosomal alterations (mCAs), with enrichment at myelofibrosis GWAS susceptibility loci and recurrently somatically mutated MPN genes (e.g., JAK2). We replicate prior MPN associations showing germline variation at the 9p24.1 risk haplotype confers elevated risk of acquiring JAK2V617F mutations, demonstrating with long-read sequencing that this relationship occurs in cis. We also describe recurrent 9p24.1 large mCAs that selectively retained JAK2V617F mutations. Germline variation associated with longer telomeres is associated with increased myelofibrosis risk. Myelofibrosis cases with high-frequency JAK2 mCAs have marked reductions in measured telomere length – suggesting a relationship between telomere biology and myelofibrosis clonal expansion. Our results advance understanding of the germline-somatic interaction at JAK2 and implicate mCAs involving JAK2 as strong promoters of clonal expansion of those mutated clones.

In the present work authors perform an integrated genomic analysis on a wide cohort of primary and secondary myelofibrosis (PMF and SMF) patients in order to characterize the heritable genetic component of these neoplasms. As a result, authors identify six novel germline susceptibility loci, and among them two variants emerged, the first in 9p24.1 locus within an intron of the frequently mutated JAK2 gene, and the second located within an intron of TERT gene. Authors show a correlation between the previously described JAK2 46/1 risk haplotype and the occurrence of JAK2V617F mutation or mosaic chromosomal alterations involving 9p24.1. According to their hypothesis, the presence of both germline variants, together with the acquisition of JAK2V617F mutation and somatic mosaic chromosomal alterations, contributes to the increased JAK2 activity and clonal expansion as demonstrated by the reduction of telomere length. The presented results are significant and original, but I have some concerns about data interpretation.
Major points
1. Myelofibrosis (MF) belongs to the chronic Philadelphia-negative myeloproliferative neoplasms (MPNs); it is generally assumed that the malignant transformation involves the myeloid lineage, while lymphoid cells do not belong to the neoplastic clone; indeed, in most cases these cells do not harbor the driver mutation observed in patients. The authors performed their analysis on whole blood specimens or PBMNCs, therefore including both neoplastic myeloid cells belonging to the expanded malignant clone (granulocytes, monocytes and circulating CD34+ cells) and lymphoid cells probably representing a non-malignant cell fraction. Might the presence of both clonal and non-clonal cells in the analyzed sample influence the analysis results? Can the variable frequency of neoplastic cells in the analyzed sample influence the analysis results? In such a complex and variable scenario, how were germline variants distinguished from somatic ones?
2. The present study is focused on MF, which can be either primary or secondary, the latter when it develops in patients who have a previous diagnosis of polycythemia vera (PV) or essential thrombocythemia (ET). Even if these two conditions (PMF and SMF) have been considered very similar for a long time, recent studies have highlighted several differences in both clinical features and molecular characteristics. On the clinical side, specific prognostic models were developed for risk prediction in SMF patients, in particular MYSEC-PM (Myelofibrosis Secondary to PV and ET - Prognostic Model). Taking this into account, it would be important to include, in Supplemental Table 2, the frequency of driver mutations (affecting JAK2, MPL and CALR) harbored by patients included in the study. Moreover, it is crucial to specify the frequency of Post-PV MF and Post-ET MF and not only the frequency of SMF. Finally, I warmly suggest considering the MYSEC-PM prognostic classification for SMF patients.
4. In the paragraph "JAK2 germline risk haplotype confers elevated risk of cis JAK2V617F mutations" the authors evaluate the association between a previously described JAK2 germline risk haplotype (JAK2 46/1) and the presence of the JAK2V617F mutation. The authors report that the frequency of JAK2V617F in the study cohort is about 60% and that it is more frequent in SMF patients than in PMF patients. This is expected, because it is already known that almost all PV patients (and therefore PostPV-MF patients) harbor the JAK2V617F driver mutation. What is the frequency of the JAK2 46/1 risk haplotype in the study cohort? Since the frequency of the JAK2V617F mutation is greater in PostPV-MF patients than in PMF and PostET-MF patients, would it be possible to evaluate the correlation between the presence of this mutation and the germline risk haplotype separately for each disease?
5. The authors demonstrate that JAK2V617F is frequently observed in cis with the JAK2 46/1 risk haplotype. Can the authors provide any observation about the co-occurrence of the JAK2V617F mutation and the rs7851556 variant? Are patients carrying rs7851556 more likely to have the JAK2V617F mutation?
6. JAK2V617F was the first driver mutation discovered in MPN patients, in 2005. It is the most frequent driver mutation, affecting the vast majority (95%) of PV patients and almost 60% of PMF and ET patients. In the present study the authors report that the interaction between germline and somatic variants in JAK2 contributes to the clonal expansion observed in MF patients. Can the authors discuss the possible involvement of the same mechanisms in other MPNs (PV and ET)? Can it be generalized that the presence of the risk haplotype rs7851556 is associated with an increased risk of acquiring a JAK2V617F mutation rather than with the development of myelofibrosis?
7. The authors say that the presence of the JAK2 risk haplotype, the JAK2V617F mutation and 9p24.1 mosaic chromosomal alterations confers increased JAK2 activity. Can the authors provide any evidence or discussion on how the germline haplotype might influence protein activity, since the novel identified risk variant and those previously described are intronic variants that are not located in the promoter region?
8. The authors report that relative telomere length is inversely correlated with JAK2V617F clonal fraction. As the authors state, JAK2V617F-mutated patients represent only about 60% of the patients included in their study cohort. Can telomere length be correlated with clonal expansion even in cases where the MPN driver mutation is a CALR or MPL variant?
Minor points
- Line 94: The authors say that primary myelofibrosis (PMF) patients "had an average time from diagnosis to transplant of 63.5 months (median= 25.1, IQR=9.0-88.3)"; then, in line 96, it is reported that secondary myelofibrosis (SMF) "had longer average time from diagnosis to transplant (122.8 months vs. 36.8 months)". It is not clear why the two mean values reported for primary myelofibrosis are different.

Author response
We thank the Reviewers for their time and insightful comments. We have crafted a revision that we hope will be acceptable to the Reviewers. Before responding point by point, we would like to note one change that slightly alters the number of JAK2 mutations; it arose from deeper QC analyses prompted by the Reviewers' comments. During our original targeted PacBio sequencing and JAK2 V617F mutation calling, we removed 54 individuals in the QC process based on low sequencing depth (total CCS reads < 1000; N= 1), having more than two germline haplotypes (N= 4), or having JAK2 V617F mutations called on more than one germline haplotype (N= 49). To interrogate JAK2 V617F mutations within our myelofibrosis cases more thoroughly, we set out to resequence these 54 individuals. One individual did not have sufficient DNA for resequencing, so this resequencing effort was conducted on 53 individuals that previously failed QC. Of the 53 individuals that originally failed our sequencing QC:
• N= 5 were identified to have the JAK2 V617F mutation on two germline haplotypes
• N= 2 were identified to have 3 germline haplotypes
• N= 46 passed all QC steps and were added to our downstream analyses
We retained the 46 individuals that passed the QC steps in the downstream mutation analyses and removed the above 7 individuals who had JAK2 V617F mutations on multiple haplotypes or had >2 germline haplotypes (i.e., potential evidence for sample contamination).
Our new mutation analyses now include a total of 924 (878 + 46) myelofibrosis subjects with the JAK2 V617F mutation identified in 562 (60.8%) individuals. We have updated the Results, Online Methods, and relevant tables and figures to reflect this work, and there were no substantive changes in the overall conclusions of our manuscript based on these updates.
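The sample accounting described above can be checked with a few lines of arithmetic. The sketch below is only a bookkeeping aid: the counts are the ones quoted in this response, while the variable and key names are illustrative and not taken from the authors' pipeline.

```python
# Counts quoted in the response; all names here are illustrative labels.
failed_initial_qc = {
    "low_sequencing_depth": 1,
    "more_than_two_germline_haplotypes": 4,
    "v617f_on_more_than_one_haplotype": 49,
}
resequencing_outcome = {
    "v617f_on_two_haplotypes": 5,
    "three_germline_haplotypes": 2,
    "passed_all_qc": 46,
}

n_failed = sum(failed_initial_qc.values())        # individuals removed originally
n_resequenced = sum(resequencing_outcome.values())  # one lacked sufficient DNA
n_total = 878 + resequencing_outcome["passed_all_qc"]
v617f_percent = 100 * 562 / n_total

print(f"failed initial QC: {n_failed}, resequenced: {n_resequenced}")
print(f"subjects in mutation analyses: {n_total}")
print(f"JAK2 V617F carriers: {v617f_percent:.1f}%")
```

The two tallies differ by exactly the one individual without sufficient DNA, and the carrier percentage follows from 562 carriers among the updated total.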
Interestingly, we identified 5 individuals with the JAK2 V617F mutation on two germline haplotypes in both sequencing runs. This may be evidence that these individuals acquired two independent JAK2 V617F mutations. We have created a table of these 5 individuals (Supplemental Table 6) which details their germline haplotypes and the number of reads that carry the JAK2 V617F mutation. These results have been added to the manuscript as well, but out of caution these 5 individuals were not included in the downstream mutation analyses.
Within the Results section: "Interestingly, during our JAK2 V617F mutation calling (Online Methods), we identified 5 individuals with evidence of the somatic mutation potentially acquired independently on both germline haplotypes (Supplemental Table 6) which were replicated in independent sequencing runs on new libraries. Future studies are needed to further explore the frequency of independent JAK2 V617F mutations on both germline haplotypes in MF cases."
Within the Discussion section: "Using independent sequencing runs, we identified 5 individuals with evidence of acquiring independent JAK2 V617F mutations on both germline haplotypes. The frequency and consequences of independent JAK2 V617F mutations on both germline haplotypes should be further studied."
Below, please find our responses to each of the Reviewer comments, as well as where the corresponding revisions can be found. All changes have been tracked with highlighted text.

Reviewer #1 (Remarks to the Author):
Polycythemia vera (PV), essential thrombocythemia (ET) and myelofibrosis (MF) constitute the group of myeloproliferative disorders (MPN). The reason these disorders are grouped together is that the differential diagnosis between them is often not clear-cut, and that MF often evolves from PV or ET. As a result, these disorders are expected to show, and are known to show, a genetic overlap, both at the germline and somatic level.

Brown et al report a GWAS on 827 MF cases vs 4135 controls. They detect six loci at P < 5×10⁻⁸, four of which are known (JAK2, TERT, IFT80, and TET2). Further, they detect two borderline-significant signals (at HLA and TP53), and an enrichment of low MF GWAS P-values among variants previously reported to associate with telomere length. The paper is well written, but the findings are not novel.
The identified JAK2 lead variant rs7851556 is in perfect LD with the well-known JAK2 MPN risk variant, which has been reported by several authors since 2009 and is known to predispose for acquisition of somatic JAK2 V617F mutation in cis on the same chromosome (PMID 19287385, PMID 19287384, PMID 19287382 and PMID 33057200). The higher expression of JAK2 in blood is also known.
The TERT, IFT80 and TET2 variants are also known MPN variants. The pleiotropy between MPN and telomere length has been demonstrated previously (PMID 33057200), and the identified TERT variant was recently reported to influence blood CD34+ stem cell levels (PMID 35007327).
The HLA and TP53 associations are borderline-significant (order of P = 10⁻⁹; thus not Bonferroni-significant), and are not subjected to replication analysis. The latter two should therefore be reported with caution, if at all.
In conclusion, this study mainly replicates things that are already known. The fact that the authors underplay this in the abstract and the text is annoying and, to my mind, almost dishonest. The authors do have a point in that their cohort comprises a larger number of MF cases than previous MPN studies, which are dominated by PV and ET. Accordingly, the way forward seems to be to rewrite the paper such that it clearly states what is already known and what is novel, including the precise relationship between their findings and previously published findings (including, for example, precise LD statistics between identified and previously reported lead variants). The authors also need to remove all kinds of premature speculation, for example "our results indicate clonal expansion of JAK2 is facilitated by germline variation associated with longer telomere length" and "The findings have translation implications, highlighting the potential role for telomerase inhibitors as treatment in high-risk individuals".

RESPONSE:
Thank you for your thoughts and comments on our manuscript. We are in agreement that various results presented in our manuscript replicate past findings from MPN studies and have tried to clarify this point in several sections of our manuscript including the updated Abstract, Results and Discussion. Specifically, in our GWAS, we have included in the Results the precise LD statistics between the identified and previously reported lead variants as well as cited references. Our findings are the first evidence suggesting MF is also associated with these MPN loci and they are not predominantly driven by PV or ET subtypes. We have further included a comment that the TERT variant was reported to be associated with blood CD34+ cell levels.
Within the Results section: "Previously identified in MPN,²⁰ this intronic variant is located in TERT, which encodes telomerase, the reverse transcriptase that extends telomeric DNA repeats, and has been associated with CD34+ to CD45+ ratio.²⁵"
For the acquisition of JAK2 V617F mutations, we have cited the relevant literature mentioned by Reviewer 1. We updated the Results and Discussion to clarify that this association has been previously reported.
Within the Results section: "Furthermore, when examining phase information, we observed a strong cis relationship between the germline risk haplotype and JAK2 V617F mutations acquired on the same risk haplotype (binomial P= 1.23×10⁻²⁶) (Supplemental Table 5), as previously observed in MPN patients.¹²⁻¹⁴"
Within the Discussion section: "We observed that individuals with the germline JAK2 risk haplotype were predisposed to acquiring a somatic JAK2 V617F mutation in cis, as previously reported,¹²⁻¹⁴ and mCAs lead to preferential over-representation of this risk haplotype containing the JAK2 V617F mutation.⁴⁴"
For the telomere length findings, we compared our germline telomere length findings to the manuscript results (PMID 33057200) mentioned by the reviewer.
Within the Results section: "A marginally significant genome-wide genetic correlation was observed between telomere length and MF (LDSC r= 0.23, s.e.m= 0.11, P= 0.038), similar in magnitude to what was reported for telomere length and MPN (LDSC r= 0.19, s.e.m= 0.09, P= 0.037).²⁰"
Our study provides notable observations. First, our study is distinctive in performing an integrated analysis of a large cohort (N=933) of clinically well annotated MF cases with multiple layers of genomic data, including genome-wide genotyping for germline risk analysis, estimation of genetically predicted telomere length, detection of mosaic chromosomal alterations (mCAs), regional sequencing, and measured telomere length. Using this approach, we have demonstrated that the JAK2 germline risk haplotype predisposes to the somatic acquisition of cis JAK2 V617F mutations, and that mCAs acquired independently over the same region lead to preferential overrepresentation of this risk haplotype containing the JAK2 V617F mutation. This possibly predisposes to increased clonal expansion and associated telomere length shortening. Public posting of this MF germline array data in dbGaP promotes new opportunities for discovery and accelerates genetic research of MF susceptibility.
Second, our GWAS of MF replicated germline susceptibility loci identified in past MPN studies, confirmed associations with the MF subtype of MPNs, and identified two novel signals at 6p21.32 (HLA-DRB9) and 17p13.1 (TP53). These new loci reach the accepted multiple testing genome-wide significance level of P<5×10⁻⁸, a level of statistical support that is commonly replicated in independent samples. The novel MF loci include biologically relevant candidate genes related to immunity and tumor suppression, both of which are potentially relevant to MF etiology. We acknowledge no independent replication of these new loci and based on Reviewer 1's comment, we have added additional text about the need for further replication.
Within the Results section: "Future studies are warranted to validate these two MF germline susceptibility loci."
Third, our study represents the first large, comprehensive genome-wide assessment of mCAs in MF patients. We found highly elevated rates of mCAs in MF cases relative to population-based controls. We replicated prior observations that mCAs preferentially expand the JAK2 MF risk haplotype and also identified a novel enrichment of mCAs at other MF GWAS susceptibility loci. Based on comments made by Reviewer 3, we further investigated mCA enrichment over the other MPN driver mutation positions (see new Supplemental Table 10 below) and observed enrichment for mCAs at these common MPN driver mutations as well, suggesting mCAs may also selectively expand other non-JAK2 MPN driver mutations.
Finally, our study provides new insights into the importance of inherited telomere length as a likely risk factor for MF susceptibility and details the impact of JAK2-related clonal expansion on subsequent TL shortening. We used polygenic risk score (PRS) of recently published germline TL variants to demonstrate an association of longer inherited TL to MF risk.
We found that the PRS was not only associated with increased MF risk but was further associated with both the presence of JAK2 V617F mutations as well as mCAs. These analyses suggest longer telomere length may afford cells with JAK2 V617F mutations or mCAs the ability to clonally expand to detectable clonal fractions, after acquiring these mutations. As reported previously and replicated in our MF study, measured rTL was not associated with JAK2 V617F mutation presence, but was inversely associated with JAK2 V617F clonal fraction (PMID: 23542632). Our results suggest mCAs are a predominant driver of this association as individuals with mCAs had the highest level of telomere length attrition, suggesting this group of MF patients may be at particularly high risk for progression and could represent a subset of MF cases more likely to benefit from telomerase inhibitors; although future studies are needed to test this hypothesis.
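As a rough consistency check on the genetic correlations quoted earlier in this response (r = 0.23 with s.e. 0.11 for MF; r = 0.19 with s.e. 0.09 for MPN), the sketch below converts an estimate and its standard error into a two-sided P-value. It assumes a simple Wald z-test against zero, which may differ in detail from how LDSC derives its reported P-values, and the inputs are rounded, so the output should only be expected to land near the reported values rather than reproduce them exactly. The function name is illustrative.

```python
import math


def two_sided_p(estimate: float, std_err: float) -> float:
    """Two-sided normal P-value for estimate / std_err tested against zero."""
    z = estimate / std_err
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# Rounded estimates and standard errors quoted in the response.
p_mf = two_sided_p(0.23, 0.11)
p_mpn = two_sided_p(0.19, 0.09)

print(f"telomere length vs MF:  z = {0.23 / 0.11:.2f}, P ~ {p_mf:.3f}")
print(f"telomere length vs MPN: z = {0.19 / 0.09:.2f}, P ~ {p_mpn:.3f}")
```

Both estimates sit roughly two standard errors from zero, which is why the correlations are described as marginally significant.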

Reviewer #2 (Remarks to the Author): Expert in MF and MPN molecular genetics and genomics
The authors provide a comprehensive genomic analysis on a very large number of myelofibrosis patients compared with an even 5 times larger number of genetically-matched, cancer-free individuals serving as controls. The main methods comprise i) SNP genotyping to identify germline susceptibility loci, ii) calling for mosaic chromosomal alterations to determine copy number state, cellular fraction, and chromosomal region of the respective events, and iii) qPCR-based measurement of the relative leukocyte telomere length for association of identified SNPs with myelofibrosis risk and proliferation. The findings are in line with previous data on this topic and extend our understanding of the etiology of myelofibrosis due to new insights into how germline variants and somatic alterations interact regarding dysregulation of JAK2 activity and thus clonal expansion in myelofibrosis patients.
In addition to known susceptibility loci in JAK2, TERT, TET2, and IFT80, the authors identified two novel germline loci in the genes HLA-DRB9 and TP53 predisposing for myelofibrosis acquisition. Furthermore, the autosomal location of both recurrent CNLOH events (mainly on chromosome 9p) and copy-number alterations (mainly recurrent loss events on chromosome 20q and 13q) is valid compared with former SNP array studies in myeloproliferative neoplasm patients. Including their data on telomere length, the authors finally provide a model in which the germline JAK2 risk haplotype predisposes to somatic acquisition of JAK2 V617F, subsequently causing clonal expansion and cell proliferation, and finally leading to accelerated telomere length shortening in myelofibrosis patients. These data and the proposed model justify the title of the manuscript. In summary, the presented manuscript is methodologically sound, very elaborate, thoughtfully written, and provides significant novelty to the field as well as for the readership of Nature Communications.

RESPONSE:
Thank you for your review of our manuscript. We are glad you found the manuscript methodologically sound and novel. Please find our responses to your remarks below.
Minor remarks:
- The study is actually based on data from 827 myelofibrosis patients after quality control and matching with cancer-free individuals, not on 937 cases as stated in the Abstract.
Thank you for pointing this out. We have reworded the sentence in the Abstract to be "up to 933 myelofibrosis cases" as this is the largest N used for any analysis.
- The paper is in general a bit tough to read due to the high number of very particular abbreviations that are used. Please check for readability.
We have removed superfluous abbreviations where appropriate. We have also included an Abbreviations section at the end of the manuscript which we hope increases readability.
- Dynamic International Prognostic Scoring System = DIPSS (not DIPPS)
Thank you for drawing our attention to this typo. We have updated 'DIPSS' throughout the text and within all Figures and Tables.

Reviewer #3 (Remarks to the Author): Expert in MF and MPN genomics
In the present work authors perform an integrated genomic analysis on a wide cohort of primary and secondary myelofibrosis (PMF and SMF) patients in order to characterize the heritable genetic component of these neoplasms. As a result, authors identify six novel germline susceptibility loci, and among them two variants emerged, the first in 9p24.1 locus within an intron of the frequently mutated JAK2 gene, and the second located within an intron of TERT gene. Authors show a correlation between the previously described JAK2 46/1 risk haplotype and the occurrence of JAK2V617F mutation or mosaic chromosomal alterations involving 9p24.1. According to their hypothesis, the presence of both germline variants, together with the acquisition of JAK2V617F mutation and somatic mosaic chromosomal alterations, contributes to the increased JAK2 activity and clonal expansion as demonstrated by the reduction of telomere length. The presented results are significant and original, but I have some concerns about data interpretation.

RESPONSE:
Thank you for your review of our manuscript. Please find our detailed responses to your comments below.
Major points
1. Myelofibrosis (MF) belongs to the chronic Philadelphia-negative myeloproliferative neoplasms (MPNs); it is generally assumed that the malignant transformation involves the myeloid lineage, while lymphoid cells do not belong to the neoplastic clone; indeed, in most cases these cells do not harbor the driver mutation observed in patients. The authors performed their analysis on whole blood specimens or PBMNCs, therefore including both neoplastic myeloid cells belonging to the expanded malignant clone (granulocytes, monocytes and circulating CD34+ cells) and lymphoid cells probably representing a non-malignant cell fraction. Might the presence of both clonal and non-clonal cells in the analyzed sample influence the analysis results? Can the variable frequency of neoplastic cells in the analyzed sample influence the analysis results? In such a complex and variable scenario, how were germline variants distinguished from somatic ones?
Thank you for the thoughtful questions. As you mentioned, our analyses were performed using DNA derived from either whole blood or PBMCs from our myelofibrosis patients. The mixture of mutated myeloid cells and normal lymphoid cells will not affect the germline genotyping data, as the clustering algorithms we use in the Illumina germline genotype calling are tuned to call genotyped probes that follow Mendelian proportions for inherited homozygous and heterozygous variants. In addition, any somatic point mutations that arise in the MF cases would be unlikely to occur at the same position as a genotyping probe, as the arrays only target 500K markers (0.02%) across the 3B base pair genome. To further confirm that neoplastic expansion of myeloid MF clones did not alter germline analysis results, we performed sensitivity analyses at the JAK2 locus by removing any individual with a JAK2 mCA, and the signal remained genome-wide significant (P=1.99×10⁻¹⁰).
For our mCA calling, we utilized the same genotype data used for our GWAS. When calling mCAs, we scan for large, contiguous genomic stretches with deviations in signal intensity and allelic imbalances of heterozygous variants, followed by QC steps in which each potential mCA event is plotted and visually inspected. Only events that pass manual review were included in our analyses. As suspected, having a mixture of both myeloid and lymphoid cells does affect the estimation of clonal fraction of mCA events, biasing the estimated cellular fractions to lower levels because of the normal lymphoid cells in the evaluated DNA. As mCA detection limits are around 3-5% depending on size and genomic location, it is also possible that we miss low cell fraction mCA events present in myeloid cells due to the dilution from lymphoid cells. Most detected mCAs in our study were at high cell fraction (median= 36%), and therefore we expect that dilution from normal lymphoid cells had a minimal impact on detection of events in our study.
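The dilution argument above can be made concrete with a small worked example. The sketch assumes the simplest possible model, in which an event is carried only by myeloid cells and lymphoid cells contribute none of it, so the fraction observed in bulk DNA is the myeloid fraction scaled by the proportion of myeloid cells in the sample. The myeloid proportions and true fractions used below are hypothetical, and the function name is illustrative.

```python
# Assumption: the event is present only in myeloid cells, so lymphoid cells
# dilute the signal proportionally. All proportions below are hypothetical.

def observed_fraction(true_myeloid_fraction: float, myeloid_proportion: float) -> float:
    """Fraction of all sampled cells that carry the event."""
    return true_myeloid_fraction * myeloid_proportion


DETECTION_LIMIT = 0.05  # conservative end of the 3-5% range quoted above

for myeloid_proportion in (0.9, 0.6, 0.3):
    mca = observed_fraction(0.40, myeloid_proportion)
    v617f = observed_fraction(0.20, myeloid_proportion)
    print(
        f"myeloid={myeloid_proportion:.0%}  observed mCA={mca:.2f}  "
        f"detectable={mca >= DETECTION_LIMIT}  V617F/mCA ratio={v617f / mca:.2f}"
    )
```

Under this model, an event at a high fraction stays detectable even when lymphoid cells make up most of the sample, and two fractions measured in the same DNA are scaled by the same factor, so their ratio does not change.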
For our JAK2 V617F mutation calling using PacBio sequencing, we sequenced at a high depth (median coverage= 6,872X), so we were unlikely to miss low cell fraction JAK2 V617F mutations. We do, however, expect the JAK2 V617F cell fractions to be similarly biased toward lower values. As the same DNA samples were used for genotyping and sequencing, any dilution of the cell fraction will be controlled for when comparing cell fractions in the sequencing vs. genotyping data.
Within the Results section we have added: "The estimated JAK2 V617F allelic fractions and mCA cellular fractions may be lower than the true somatic fraction in the actual diseased myeloid cells because we used whole blood DNA from most of the patients. However, this would not affect the ratio of JAK2 V617F allelic fraction to mCA cellular fraction."
2. The present study is focused on MF, which can be either primary or secondary, the latter when it develops in patients who have a previous diagnosis of polycythemia vera (PV) or essential thrombocythemia (ET). Even if these two conditions (PMF and SMF) have been considered very similar for a long time, recent studies have highlighted several differences in both clinical features and molecular characteristics. On the clinical side, specific prognostic models were developed for risk prediction in SMF patients, in particular MYSEC-PM (Myelofibrosis Secondary to PV and ET - Prognostic Model). Taking this into account, it would be important to include, in Supplemental Table 2, the frequency of driver mutations (affecting JAK2, MPL and CALR) harbored by patients included in the study. Moreover, it is crucial to specify the frequency of Post-PV MF and Post-ET MF and not only the frequency of SMF. Finally, I warmly suggest considering the MYSEC-PM prognostic classification for SMF patients.
Thank you for these comments. We do not have complete clinical information on driver mutations (MPL and CALR) in all MF cases. Here, we performed targeted sequencing of the JAK2 region to call JAK2 mutations to follow-up on the JAK2 MF susceptibility region from the GWAS analysis. We have an on-going sequencing project in this MF patient population where we will be able to evaluate those mutations in the future.
We have added the frequency of both Post-PV MF and Post-ET MF to Supplemental Table 1. This information is available for each of our analytical samples. Additionally, as detailed below in subsequent responses, we have performed new analyses which stratify SMF into Post-PV MF and Post-ET MF.
Thank you for your suggestion to use MYSEC-PM prognostic classification within SMF patients. We will use this classification in our future prognostic study after we complete the sequencing.
… Table 3 and new Supplemental Figure 3 which detail these SMF stratified GWAS results. At each locus identified in our main GWAS study, all point estimates remain in the same direction in both the post-PV MF and post-ET MF analyses. Interestingly, the JAK2 locus still reaches genome-wide significance within the post-PV MF analysis, which is consistent with prior reports of higher JAK2 involvement in patients with PV. We have added these new GWAS findings to the Results section and have included the new Supplemental Table 3 and new Supplemental Figure 3 in the manuscript supplement.
Supplemental Figure 3. Stacked Manhattan plots from the genome-wide association study stratified by post-polycythemia vera myelofibrosis (119 cases, 595 controls; Top plot) and post-essential thrombocythemia myelofibrosis (139 cases, 695 controls; Bottom plot). The association -log10 P-values are plotted for each tested genetic variant on the y-axis and chromosomal position on the x-axis. The nearest gene for each identified locus is labeled. The red line indicates the genome-wide significance threshold (5×10⁻⁸).
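To make the y-axis of such a Manhattan plot easier to read, the short sketch below converts P-values to the -log10 scale and locates the genome-wide significance line. It uses only numbers quoted in this response (the 5×10⁻⁸ threshold, the P-value from the JAK2 sensitivity analysis, and the case and control counts in the caption); the function and variable names are illustrative.

```python
import math

GENOME_WIDE_P = 5e-8  # threshold drawn as the red line in the plot


def neg_log10(p_value: float) -> float:
    """Height of a variant on the Manhattan plot y-axis."""
    return -math.log10(p_value)


threshold = neg_log10(GENOME_WIDE_P)

# P-value quoted in this response for the JAK2 sensitivity analysis.
jak2_sensitivity = neg_log10(1.99e-10)

# Case and control counts taken from the figure caption.
strata = {"post-PV MF": (119, 595), "post-ET MF": (139, 695)}

print(f"significance line at -log10(P) = {threshold:.2f}")
print(f"JAK2 sensitivity signal at {jak2_sensitivity:.2f}, "
      f"above line: {jak2_sensitivity > threshold}")
for name, (cases, controls) in strata.items():
    print(f"{name}: {controls / cases:.0f} controls per case")
```

A variant clears the red line whenever its -log10 P-value exceeds that of the threshold, and both strata in the caption keep the same ratio of controls to cases.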

4. In the paragraph "JAK2 germline risk haplotype confers elevated risk of cis JAK2V617F mutations" the authors evaluate the association between a previously described JAK2 germline risk haplotype (JAK2 46/1) and the presence of the JAK2V617F mutation. The authors report that the frequency of JAK2V617F in the study cohort is about 60% and that it is more frequent in SMF patients than in PMF patients. This is expected, because it is already known that almost all PV patients (and therefore PostPV-MF patients) harbor the JAK2V617F driver mutation. What is the frequency of the JAK2 46/1 risk haplotype in the study cohort? Since the frequency of the JAK2V617F mutation is greater in PostPV-MF patients than in PMF and PostET-MF patients, would it be possible to evaluate the correlation between the presence of this mutation and the germline risk haplotype separately for each disease?
We have added the 46/1 haplotype frequency within our MF cases to the paragraph in question. We provide this for the overall patient population as well as stratified by PMF, post-PV MF, and post-ET MF.
Within the Results section: "… with 634 (68.61%) individuals carrying the JAK2 46/1 germline risk haplotype, and post-polycythemia vera MF (88.15%) having a higher frequency than both primary MF (65.98%) and post-essential thrombocythemia MF (66.45%)." We have also performed new analyses that stratify the association between the JAK2 V617F mutation and germline haplotype within PMF, post-PV MF, and post-ET MF. Please see the tables below, for your reference, that detail these stratified results. We see that the overall cis relationship is still observed when stratified by each type of MF.
Within the Results section we now provide these new results: "These results were consistently observed when stratified by type of MF: primary MF (binomial P= 1.86×10⁻¹³), post-polycythemia vera MF (binomial P= 4.88×10⁻¹⁰), and post-essential thrombocythemia MF (binomial P= 1.90×10⁻⁴)."

The reviewer's assumption is correct here. Our identified lead GWAS SNP (rs7851556) is in very high LD with the 46/1 risk haplotype. Below please find Supplemental Table 4 which details these new results.
We have also added the following to the Results section: "MF cases carrying the risk allele (T) of rs7851556 (our lead GWAS SNP) were more likely to acquire a somatic JAK2 V617F mutation (P= 9.41×10⁻¹⁴) (Supplemental Table 4)."

6. JAK2V617F was the first driver mutation discovered in MPN patients, in 2005. It is the most frequent driver mutation, affecting the vast majority (95%) of PV patients and almost 60% of PMF and ET patients. In the present study, the authors report that the interaction between germline and somatic variants in JAK2 contributes to the clonal expansion observed in MF patients. Can the authors discuss the possible involvement of the same mechanisms in other MPNs (PV and ET)? Can it be generalized that the presence of the risk haplotype tagged by rs7851556 is associated with an increased risk of acquiring a JAK2V617F mutation rather than with the development of myelofibrosis?
As shown in our response to your above comment, we consistently observed the cis relationship between the 46/1 germline risk haplotype and the JAK2 V617F mutation for each type of MF (PMF, Post-PV MF, and Post-ET MF). Also, within our (new) stratified GWAS, we show that rs7851556 is associated with increased risk of PMF, Post-PV MF, and Post-ET MF, although the results for Post-ET MF do not reach genome-wide significance, probably due to the smaller sample size.
As rs7851556 is a strong LD tag for the 46/1 haplotype, we hypothesize that rs7851556 is not a specific risk factor for MF, but rather for MPN as a whole, most likely due to the increased risk of acquiring a JAK2 V617F mutation, as shown previously (PMID 19287382, PMID 19287384, and PMID 19287385).
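For readers unfamiliar with the stratified binomial tests of the cis relationship quoted above, a minimal sketch follows. It assumes the null hypothesis is that, in individuals heterozygous for the risk haplotype, JAK2 V617F is equally likely to arise on either haplotype (p = 0.5). The exact test configuration used in the manuscript is not given in this response, and the function name and any counts passed to it are hypothetical.

```python
from math import comb


def binomial_two_sided_p(k_risk: int, n: int) -> float:
    """Exact two-sided binomial P-value under a null of p = 0.5.

    k_risk: number of heterozygous carriers in whom V617F sits on the
            risk haplotype (hypothetical input, not from the manuscript).
    n:      number of heterozygous carriers with a phased V617F call.
    """
    if not 0 <= k_risk <= n:
        raise ValueError("k_risk must lie between 0 and n")
    # With p = 0.5 the distribution is symmetric, so the two-sided
    # P-value is twice the upper tail from the more extreme count.
    k_extreme = max(k_risk, n - k_risk)
    tail = sum(comb(n, i) for i in range(k_extreme, n + 1))
    return min(1.0, 2 * tail / 2 ** n)
```

Running the function once per MF subtype on that subtype's phased counts would mirror the stratified layout of the reported results; smaller strata yield larger P-values for the same proportion, as noted for post-ET MF.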
7. The authors say that the presence of the JAK2 risk haplotype, the JAK2V617F mutation, and 9p24.1 mosaic chromosomal alterations confers increased JAK2 activity. Can the authors provide any evidence or discussion on how the germline haplotype might influence protein activity, given that the newly identified risk variant and those previously described are intronic variants that are not located in the promoter region?
We agree with the reviewer that the identified germline risk variant (rs7851556) is an intronic variant, thus it is likely not a functional variant. Our eQTL and TWAS results indicate that the germline risk haplotype is associated with elevated JAK2 expression. It is likely that a variant in high LD with our identified risk variant is functional.
There is a 50+ Kb LD block within the JAK2 region containing our identified risk variant (plot given below). Using LDproxy (https://ldlink.nci.nih.gov/?tab=ldproxy), variants in LD with our risk variant can be evaluated for predicted functionality. There are several variants in high LD with the lead GWAS variant that demonstrate functional potential. Within the Discussion section we have further expanded on this point: "While we were able to connect germline susceptibility and somatic mutations in cross-sectional data from MF patients, future longitudinal assessment will be key in follow-up of our study. Likewise future studies characterizing germline functional variation are needed to better understand how the germline JAK2 MF susceptibility locus leads to altered JAK2 expression and acquisition of JAK2 V617F mutations."

8. The authors report that relative telomere length is inversely correlated with increasing JAK2V617F clonal fraction. As the authors note, JAK2V617F-mutated patients represent only about 60% of the patients included in their study cohort. Can telomere length be correlated with clonal expansion even in cases where the MPN driver mutation is a CALR or MPL variant?

Thank you for this question. From our rTL analyses we observed: 1. An inverse association between measured rTL and presence of any autosomal mCA (OR= 0.14, 95% CI= 0.

Together, these results suggest that it is not the JAK2 V617F mutation that is leading to clonal expansion; rather, it is the mCAs, many of which selectively retain or duplicate JAK2 V617F mutations, that promote rapid clonal expansion of mutated clones, resulting in significant reductions in telomere length. We demonstrated in Supplemental Table 7 that each GWAS susceptibility locus showed enrichment for mCAs compared to age- and sex-matched cancer-free individuals in the UK Biobank, suggesting mCAs could clonally expand MF risk-conferring alleles at susceptibility loci.
Based on your suggestion, we further investigated mCA enrichment over the other MPN driver mutation positions (see new Supplemental Table 10). These results suggest that mCAs may also clonally expand other MPN driver mutations, but future studies with MPL and CALR sequencing data are needed to fully test this hypothesis.
We have added these findings to the Results and included the new Supplemental Table 10 in the manuscript.

In Supplemental Table 4 we display JAK2 V617F mutation status by JAK2 genotype (rs7851556) status, with all the genotype counts. The distance between our lead GWAS variant (rs7851556) and the JAK2 V617F mutation is approximately 50 Kb (chr9:5022807-5073770). Repetitive regions in this area pose several challenges for sequencing. To avoid these obstacles, we sequenced a 7 Kb region around the JAK2 V617F mutation that included variants in the 46/1 haplotype. We demonstrated a cis relationship between the 46/1 haplotype and the JAK2 V617F mutation, and because these variants in the 46/1 risk haplotype are in high LD with our lead GWAS SNP (R² > 0.93, see below plot), a cis relationship between rs7851556 and the JAK2 V617F mutation is established.

Within the above plot, our lead GWAS variant (rs7851556) is colored blue. The star represents the JAK2 V617F mutation, and the red box is the 7Kb region we sequenced.
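The LD argument above rests on the pairwise R² statistic between the lead variant and the variants in the sequenced region. As an illustration only, R² between two biallelic variants can be computed from phased haplotype counts as sketched below. The function name and its inputs are hypothetical, and the value reported in the response presumably comes from a reference panel rather than from a calculation like this one.

```python
def ld_r_squared(n_AB: int, n_Ab: int, n_aB: int, n_ab: int) -> float:
    """Pairwise LD r^2 from counts of the four phased haplotypes.

    A/a are the alleles at the first variant and B/b at the second;
    the counts are hypothetical inputs, not values from the manuscript.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / total            # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / total    # allele frequency of A
    p_B = (n_AB + n_aB) / total    # allele frequency of B
    d = p_AB - p_A * p_B           # linkage disequilibrium coefficient D
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("r^2 is undefined when a variant is monomorphic")
    return d * d / denom
```

An r² near 1 means one variant almost perfectly predicts the other on the same chromosome, which is why phasing V617F against the 46/1 variants is treated as informative for rs7851556.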
6. The authors answered the question. Related to this, I think that the Discussion contains no clear reference to the identified lead variant rs7851556. A reference to rs7851556 should be included in lines 338-339; otherwise the text seems to refer only to the previously described risk haplotype.
Thank you for your comment. We have added a reference to the identified lead GWAS variant within the section mentioned. Within the Discussion: "We observed that individuals with the germline JAK2 risk haplotype tagged by rs7851556 were predisposed to acquiring a somatic JAK2 V617F mutation in cis, as previously reported…"

7. The authors agree with me that the rs7851556 germline risk variant is not likely to be a functional variant, since it is located in an intronic region. Using LDproxy they identified many other variants in LD with the identified risk variant. None of these variants has been demonstrated to be functionally relevant. This is one of the main limitations of the study, since the title of the paper points toward an involvement of genetic variants in determining clonal expansion without demonstrating a true functional effect of the described variant. The authors have demonstrated that the presence of the risk variant rs7851556 correlates with the presence of the JAK2V617F mutation. It is known that the JAK2V617F mutation correlates with increased JAK2 expression (PMID: 24740812). Is it possible that the correlation between the risk variant and gene expression is driven by the effect of the JAK2V617F mutation itself? Related to this, in the Discussion (lines 343-345) the authors say "Increased JAK2 activity conferred by the germline JAK2 risk haplotype, JAK2V617F activating mutation, and 9p24.1 mCAs promotes a cellular phenotype characterized by increased clonal expansion." This is an assumption by the authors: they observed increased expression of JAK2 correlating with the presence of the rs7851556 risk variant, not increased activity of JAK2. I believe the authors should state clearly that this is their leading hypothesis. As they say in the following lines, a functional demonstration of how these factors cooperate in promoting clonal expansion is missing.
In lines 369-370 the authors say that flow-FISH might be useful to test their hypothesis. Do they have any preliminary results that associate telomere length with clonal expansion and the presence of JAK2 mCAs in MF patients?
Our study deeply characterized germline and somatic profiles (including JAK2 V617F mutations and mCAs) in a large population of MF cases. We localized germline variation at 9p24.1 to the tagging marker rs7851556 and provided evidence for how this germline variation influences somatic profiles in MF cases as well as JAK2 expression levels in normal circulating leukocytes (GTEx). As GWAS identifies regions of germline variation important for MF risk, we make no claims of function for the rs7851556 variant as it likely tags other germline variation with functional relevance. We agree with the Reviewer that rs7851556 is likely not functional and recommended in the Discussion section further studies to identify functionally relevant variation in the region.
rs7851556 could have effects through promoting JAK2 expression (as observed in our eQTL analyses in normal whole blood) as well as through leading to susceptibility of the JAK2 V617F and mCA changes that have downstream impacts on JAK2 activity and expression (PMID: 24740812), but extensive functional work would be needed and teasing out impacts of rs7851556 vs JAK2 V617F is challenging due to high correlation. As we do not have laboratory capacity to assess the functional mechanism by which the risk variant alters JAK2 expression, our analyses and inferences lean heavily on public resources and prior published reports.
For the Discussion section sentence in question, we agree that this is our hypothesis. To clarify, we have modified the text as follows: "Altered JAK2 activity conferred by the germline JAK2 risk haplotype, JAK2 V617F activating mutation, and 9p24.1 mCAs could lead to a cellular phenotype characterized by increased clonal expansion."

We do not have flow-FISH data to present with regard to telomere length. Functional work is beyond the intended scope of our study's analytical aims; however, existing published reports along with evidence from public data lend strong support to our proposed etiologic framework.

8. The authors answered the question and identified mCAs affecting other MPN driver mutation positions that may contribute to clonal expansion in JAK2-negative patients. Did the authors observe any association with reduced rTL suggesting clonal expansion also in patients with mCAs affecting the CALR and MPL genomic regions?
In our analyses, we observed a strong relationship in which increasing mCA cellular fraction was associated with a substantial decrease in measured rTL (β= -0.57, 95% CI= -0.74 to -0.39, P= 4.76×10⁻¹⁰). These analyses were conducted for any mCA. We did not perform mCA cellular fraction analyses restricted to mCAs spanning CALR or MPL mutations due to the limited sample size (N=25 and 31, respectively), but our overall analyses suggest that the inverse association between rTL and cellular fraction is consistent regardless of mCA position.
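As a reading aid, the effect estimate, confidence interval, and P-value quoted above can be checked for mutual consistency with a simple Wald calculation. The sketch below assumes a normal approximation and a symmetric 95% interval; the regression model actually used is not described in this response, so the helper `wald_from_ci` is hypothetical and only approximate agreement should be expected from rounded inputs.

```python
from math import erfc, sqrt


def wald_from_ci(beta: float, ci_low: float, ci_high: float,
                 z_crit: float = 1.959964) -> tuple[float, float, float]:
    """Recover (standard error, z statistic, two-sided P) from a 95% CI.

    Assumes the interval is beta +/- z_crit * SE under a normal
    approximation, which is an assumption and not stated in the source.
    """
    se = (ci_high - ci_low) / (2 * z_crit)  # half-width divided by z_crit
    z = beta / se
    p = erfc(abs(z) / sqrt(2))              # two-sided normal tail area
    return se, z, p


# Values quoted in the response for rTL versus mCA cellular fraction.
se, z, p = wald_from_ci(-0.57, -0.74, -0.39)
```

Under these assumptions the recovered P-value is on the order of 10⁻¹⁰, in line with the reported magnitude; exact agreement is not expected because the interval endpoints are rounded to two decimals.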
Minor Comments
- Line 173: remove "Also", JAK2V617F mutation is a driver mutation in MPN.
Removed.